181 research outputs found

Efficient Generator of Mathematical Expressions for Symbolic Regression

Author: Džeroski Sašo
Mežnar Sebastian
Todorovski Ljupčo
Publication date: 10/09/2023

We propose an approach to symbolic regression based on HVAE, a novel variational autoencoder for generating hierarchical structures. It combines simple atomic units with shared weights to recursively encode and decode the individual nodes in the hierarchy. Encoding is performed bottom-up and decoding top-down. We empirically show that HVAE can be trained efficiently with small corpora of mathematical expressions and can accurately encode expressions into a smooth low-dimensional latent space. This latent space can be efficiently explored with various optimization methods to address the task of symbolic regression. Indeed, random search through the latent space of HVAE performs better than random search through expressions generated by manually crafted probabilistic grammars for mathematical expressions. Finally, the EDHiE system for symbolic regression, which applies an evolutionary algorithm to the latent space of HVAE, reconstructs equations from a standard symbolic regression benchmark better than a state-of-the-art system based on a similar combination of deep learning and evolutionary algorithms.

Comment: 35 pages, 11 tables, 7 multi-part figures; Machine Learning (Springer) and journal track of ECML/PKDD 202

arXiv.org e-Print Archive

Repository of the University of Ljubljana

Dejavniki vetroloma na primeru vetroloma na Pokljuki

Author: Džeroski Sašo
Jurc Maja
Ogris Nikica
Publication venue: Gozdarski inštitut Slovenije
Publication date: 12/07/2017

This paper presents a case study of windthrow. The study area comprised two forest gaps covering 1.7 ha on the Pokljuka plateau, Slovenia, where strong wind had blown down 44 trees. The 44 standing trees closest to the fallen trees served as a control group. The following variables were measured for the fallen trees: diameter at breast height, tree height, crown diameter and crown height, the number and diameter of roots, the volume of the root system, and root rot, which was assessed from an increment core. Standing trees were measured for diameter at breast height, tree height, crown diameter and crown height, and the number and diameter of roots. In addition to statistical processing, the data were analysed using machine learning methods in the Weka computer program. The most important factors of windthrow in the case study area were storm wind (speed above 17 m/s), wet shallow soil, and the edges of the forest gaps. The results also show that diameter at breast height, tree height, and the presence of root rot can be classified as windthrow factors.

Digital repository of Slovenian research organizations

Semi-supervised Predictive Clustering Trees for (Hierarchical) Multi-label Classification

Author: Ceci Michelangelo
Džeroski Sašo
Kocev Dragi
Levatić Jurica
Publication date: 19/07/2022

Semi-supervised learning (SSL) is a common approach to learning predictive models from both labeled and unlabeled examples. While SSL for the simple tasks of classification and regression has received a lot of attention from the research community, it has not been properly investigated for complex prediction tasks with structurally dependent variables. This is the case for multi-label classification and hierarchical multi-label classification, which may require additional information, possibly coming from the underlying distribution in the descriptive space provided by unlabeled examples, to better handle the challenging task of predicting multiple class labels simultaneously. In this paper, we investigate this aspect and propose a (hierarchical) multi-label classification method based on semi-supervised learning of predictive clustering trees. We also extend the method towards ensemble learning and propose a method based on the random forest approach. Extensive experimental evaluation conducted on 23 datasets shows significant advantages of the proposed method and its extension with respect to their supervised counterparts. Moreover, the method preserves interpretability and reduces the time complexity of classical tree-based models.

arXiv.org e-Print Archive

Integrating Guidance into Relational Reinforcement Learning

Author: Kurt Driessens
Sašo Džeroski
Publication venue: Springer Science and Business Media LLC

Crossref